The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice or about the bottlenecks the community faces in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for it, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants. Only 50% of the participants performed ensembling, based either on multiple identical models (61%) or on heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
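Although there are semantic segmentation methods that perform very well on many medical datasets, they are often not designed for direct use in clinical practice. The two main concerns are generalization to unseen data with a different visual appearance, e.g., images acquired using a different scanner, and efficiency in terms of computation time and required graphics processing unit (GPU) memory. In this work, we employ a multi-organ segmentation model based on the SpatialConfiguration-Net (SCN), which integrates prior knowledge of the spatial configuration among the labeled organs to resolve spurious responses in the network outputs. Furthermore, we modified the architecture of the segmentation model to reduce its memory footprint as much as possible without drastically affecting the quality of the predictions. Lastly, we implemented a minimal inference script, which we optimized for both execution time and required GPU memory.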
Neural network (NN) potentials promise highly accurate molecular dynamics (MD) simulations within the computational complexity of classical MD force fields. However, when applied outside their training domain, NN potential predictions can be inaccurate, increasing the need for Uncertainty Quantification (UQ). Bayesian modeling provides the mathematical framework for UQ, but classical Bayesian methods based on Markov chain Monte Carlo (MCMC) are computationally intractable for NN potentials. By training graph NN potentials for coarse-grained systems of liquid water and alanine dipeptide, we demonstrate here that scalable Bayesian UQ via stochastic gradient MCMC (SG-MCMC) yields reliable uncertainty estimates for MD observables. We show that cold posteriors can reduce the required training data size and that for reliable UQ, multiple Markov chains are needed. Additionally, we find that SG-MCMC and the Deep Ensemble method achieve comparable results, despite shorter training and less hyperparameter tuning of the latter. We show that both methods can capture aleatoric and epistemic uncertainty reliably, but not systematic uncertainty, which needs to be minimized by adequate modeling to obtain accurate credible intervals for MD observables. Our results represent a step towards accurate UQ that is of vital importance for trustworthy NN potential-based MD simulations required for decision-making in practice.
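As a rough illustration of how aleatoric and epistemic uncertainty are commonly separated for an ensemble of models, the sketch below applies the law of total variance to per-member predictions. It assumes each ensemble member outputs a predictive mean and variance, which the abstract does not state; the function name `ensemble_uncertainty` and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def ensemble_uncertainty(means, variances):
    """Split ensemble predictive variance into two parts.

    means, variances: arrays of shape (n_members, n_points) holding each
    member's predictive mean and variance (an assumed output format).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    prediction = means.mean(axis=0)      # ensemble mean prediction
    aleatoric = variances.mean(axis=0)   # average noise predicted by the members
    epistemic = means.var(axis=0)        # disagreement between the members
    return prediction, aleatoric, epistemic

# Toy values for two members and two query points (made up for illustration).
prediction, aleatoric, epistemic = ensemble_uncertainty(
    means=[[1.0, 2.0], [3.0, 2.0]],
    variances=[[0.1, 0.2], [0.3, 0.2]],
)
total_variance = aleatoric + epistemic
```

At the second query point the members agree, so the epistemic term vanishes and only the predicted noise remains.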
Neural information retrieval (IR) systems have progressed rapidly in recent years, in large part due to the release of publicly available benchmarking tasks. Unfortunately, some dimensions of this progress are illusory: the majority of the popular IR benchmarks today focus exclusively on downstream task accuracy and thus conceal the costs incurred by systems that trade away efficiency for quality. Latency, hardware cost, and other efficiency considerations are paramount to the deployment of IR systems in user-facing settings. We propose that IR benchmarks structure their evaluation methodology to include not only metrics of accuracy, but also efficiency considerations such as query latency and the corresponding cost budget for a reproducible hardware setting. For the popular IR benchmarks MS MARCO and XOR-TyDi, we show how the best choice of IR system varies according to how these efficiency considerations are chosen and weighed. We hope that future benchmarks will adopt these guidelines toward more holistic IR evaluation.
We propose a routing algorithm that takes a sequence of vectors and computes a new sequence with specified length and vector size. Each output vector maximizes "bang per bit," the difference between a net benefit to use and net cost to ignore data, by better predicting the input vectors. We describe output vectors as geometric objects, as latent variables that assign credit, as query states in a model of associative memory, and as agents in a model of a Society of Mind. We implement the algorithm with optimizations that reduce parameter count, computation, and memory use by orders of magnitude, enabling us to route sequences of greater length than previously possible. We evaluate our implementation on natural language and visual classification tasks, obtaining competitive or state-of-the-art accuracy and end-to-end credit assignments that are interpretable.
The upcoming exascale era will provide a new generation of physics simulations. These simulations will have a high spatiotemporal resolution, which will impact the training of machine learning models, since storing large amounts of simulation data on disk is nearly impossible. Therefore, we need to rethink the training of machine learning models for simulations in the upcoming exascale era. This work presents an approach that trains a neural network concurrently with a running simulation without storing data on disk. The training pipeline accesses the training data by in-memory streaming. Furthermore, we apply methods from the domain of continual learning to enhance the generalization of the model. We tested our pipeline by training a 3D autoencoder concurrently with a laser wakefield acceleration particle-in-cell simulation. Furthermore, we experimented with various continual learning methods and their effect on generalization.
Media has a substantial impact on the public perception of events. A one-sided or polarizing perspective on any topic is usually described as media bias. One way bias can be introduced into news articles is by altering word choice. Biased word choices are not always obvious, nor do they exhibit high context-dependency. Hence, detecting bias is often difficult. We propose a Transformer-based deep learning architecture trained via Multi-Task Learning using six bias-related data sets to tackle the media bias detection problem. Our best-performing implementation achieves a macro $F_{1}$ of 0.776, a performance boost of 3% compared to our baseline, outperforming existing methods. Our results indicate that Multi-Task Learning is a promising way to improve existing baseline models in identifying slanted reporting.
Dysgraphia, a handwriting learning disability, has a serious negative impact on children's academic results, daily life and overall wellbeing. Early detection of dysgraphia allows for an early start of a targeted intervention. Several studies have investigated dysgraphia detection by machine learning algorithms using a digital tablet. However, these studies deployed classical machine learning algorithms with manual feature extraction and selection as well as binary classification: either dysgraphia or no dysgraphia. In this work, we investigated fine grading of handwriting capabilities by predicting the SEMS score (between 0 and 12) with deep learning. Our approach provides an accuracy of more than 99% and a root mean square error lower than one, with automatic instead of manual feature extraction and selection. Furthermore, we used a smart pen called SensoGrip, a pen equipped with sensors to capture handwriting dynamics, instead of a tablet, enabling writing evaluation in more realistic scenarios.
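We propose methods that enable efficient hierarchical classification in parallel. Our methods transform a batch of classification scores and labels, corresponding to given nodes in a semantic tree, into scores and labels corresponding to all nodes in the ancestral paths of those nodes, in a form suited to hardware accelerators. We implement our methods on current hardware accelerators and test them with a tree incorporating all English-language synsets in WordNet 3.0, spanning 117,659 classes in 20 levels of depth. We transform batches of scores and labels to their respective ancestral paths, incurring negligible computation and consuming only a fixed 0.04GB of memory over the footprint of the data.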
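One way to picture the transformation described in this abstract is a precomputed lookup table of ancestral paths that is indexed with a whole batch of labels at once. The sketch below is a minimal NumPy illustration under that assumption and is not the authors' implementation. The toy tree, the `-1` padding value, the helper name `build_ancestor_paths`, and the layout of `scores` (one score per tree node) are all hypothetical.

```python
import numpy as np

def build_ancestor_paths(parent, pad=-1):
    """Precompute the root-to-node path of every node, padded to the tree depth.

    parent[i] is the parent of node i, or -1 for the root.
    """
    paths = []
    for node in range(len(parent)):
        path, cur = [], node
        while cur != -1:            # walk up to the root
            path.append(int(cur))
            cur = parent[cur]
        paths.append(path[::-1])    # store in root-to-node order
    depth = max(len(p) for p in paths)
    table = np.full((len(parent), depth), pad, dtype=np.int64)
    for node, path in enumerate(paths):
        table[node, :len(path)] = path
    return table

# Toy tree: 0 is the root; 1 and 2 are children of 0; 3 and 4 of 1; 5 of 4.
parent = np.array([-1, 0, 0, 1, 1, 4])
table = build_ancestor_paths(parent)

# A batch of labels becomes a batch of ancestral paths with one indexing step.
labels = np.array([5, 2, 3])
label_paths = table[labels]

# Gather the score of every node on each path; padded slots become NaN.
scores = np.arange(18, dtype=float).reshape(3, 6)  # hypothetical per-node scores
mask = label_paths != -1
safe_index = np.where(mask, label_paths, 0)
path_scores = np.where(mask, np.take_along_axis(scores, safe_index, axis=1), np.nan)
```

The table is built once, so the per-batch work reduces to indexing and gathering, which is the kind of operation that maps well onto accelerators.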
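To understand the origins of material properties, neutron scattering experiments at three-axis spectrometers (TAS) investigate magnetic and lattice excitations in a sample by measuring intensity distributions in its momentum (Q) and energy (E) space. However, the high demand and limited availability of beam time for TAS experiments raise the natural question of whether we can improve their efficiency or make better use of the experimenter's time. In fact, there are many scientific problems at TAS that require searching for signals of interest in a particular region of Q-E space, which is time-consuming and inefficient when done manually, since measurement points may be placed in uninformative regions such as the background. Active learning is a promising general machine learning approach that iteratively detects informative regions of signal autonomously, i.e., without human interference, thus avoiding unnecessary measurements and speeding up the experiment. In addition, the autonomous mode allows experimenters to focus on other relevant tasks in the meantime. The approach we describe in this article exploits log-Gaussian processes, which, due to the log transformation, have the largest approximation uncertainties in regions of signal. Maximizing uncertainty as the acquisition function hence directly yields locations for informative measurements. We demonstrate the benefits of our approach on the outcomes of a real neutron experiment at the thermal TAS EIGER (PSI) as well as on the results of a benchmark in a synthetic setting that includes numerous different excitations.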
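The reason a log transformation concentrates uncertainty on signal regions can be seen from the log-normal variance formula: if the latent value is Gaussian with mean μ and standard deviation σ, its exponential has variance (exp(σ²) − 1)·exp(2μ + σ²), which grows with μ. The sketch below illustrates this with hand-picked posterior values; it does not fit a Gaussian process, the candidate values are made up, and the paper's exact acquisition function may differ.

```python
import numpy as np

def lognormal_std(mu, sigma):
    """Standard deviation of exp(f) when f is Gaussian with mean mu and std sigma."""
    variance = (np.exp(sigma**2) - 1.0) * np.exp(2.0 * mu + sigma**2)
    return np.sqrt(variance)

# Hypothetical latent posterior at four candidate measurement points:
# two background points (low mean) and two signal points (high mean).
mu = np.array([0.0, 0.0, 3.0, 3.0])
sigma = np.array([0.5, 0.5, 0.5, 0.1])

uncertainty = lognormal_std(mu, sigma)

# Use the uncertainty itself as the acquisition value and measure at its maximum.
next_point = int(np.argmax(uncertainty))
```

Points 0 and 2 share the same latent standard deviation, yet point 2 has a much larger uncertainty after the transformation because its mean is higher, so it is selected for the next measurement.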